WILEY.MICROSOFT.BIG.DATA.SOLUTIONS.2014 by John Welch & Dan Clark & Christopher Price & Brian Mitchell

Author:John Welch & Dan Clark & Christopher Price & Brian Mitchell
Format: epub
ISBN: 9781118729557
Published: 0101-01-01T00:00:00+00:00


Using Hive

Hive is another tool for creating and running map-reduce jobs in Hadoop. One of its major advantages is that it presents a relational, table-like layer over the data files. Under this paradigm, you can query the data with familiar SQL-style techniques, which is especially helpful if you have a SQL background. You also do not have to worry about how a query is translated into map-reduce jobs: Hive's query engine works out an efficient way to load and aggregate the data.
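To make the translation from query to map-reduce job more concrete, the following Python sketch imitates, in a single process, the map, shuffle, and reduce steps that a simple grouped aggregation implies. It is a conceptual illustration only, not a description of how Hive's query engine actually plans or runs a job. The sales table, its region and amount columns, the sample rows, and all function names are assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical rows standing in for a data file that Hive would expose as a table.
ROWS = [
    ("east", 10.0),
    ("west", 5.0),
    ("east", 2.5),
    ("west", 7.5),
    ("north", 1.0),
]

# The kind of query you might write in Hive (table and column names are made up):
#   SELECT region, SUM(amount) FROM sales GROUP BY region;


def map_phase(rows):
    """Emit one (key, value) pair per input row."""
    for region, amount in rows:
        yield region, amount


def shuffle_phase(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's list of values into a single aggregate."""
    return {key: sum(values) for key, values in grouped.items()}


def run_query(rows):
    """Chain the three phases to mimic the GROUP BY query above."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    for region, total in sorted(run_query(ROWS).items()):
        print(region, total)
```

With Hive, you would write only the query shown in the comment; the three phases are spelled out here purely to show the kind of work that the query hides from you.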

In the following sections, you will learn how to perform advanced data analysis with Hive. First, you will look at the different types of built-in Hive functions. Next, you will see how to extend Hive with custom map-reduce scripts written in Python. Then you will go one step further and create a user-defined function (UDF) to extend the functionality of Hive.


